Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make spawn an option again #15379

Merged
merged 2 commits into from
Jun 16, 2017

Conversation

jrafanie
Copy link
Member

@jrafanie jrafanie commented Jun 14, 2017

This is nearly a no-op for the existing forking setup, but allows you to spawn workers instead via an environment variable.

We had to add some complexity for subclasses that implement start_runner because we don't want to bypass their implementation.

EDIT: Embedded ansible worker was changed to implement both of the
start_runner fork/spawn methods to ensure the threaded implementation is
used.

MIQ_SPAWN_WORKERS=true bin/rake evm:start => Kernel.spawn
MIQ_SPAWN_WORKERS=false bin/rake evm:start => fork
bin/rake evm:start => fork

Note, when using @NickLaMuro's singular_worker script, the second commit is needed to avoid having millions of worker rows.

@jrafanie jrafanie requested review from gtanzillo and NickLaMuro June 14, 2017 22:07
Copy link
Member

@NickLaMuro NickLaMuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least, I would like to see some comments surrounding the "why" for the EmbeddedAnsible worker.

I see the work around that you did here as a step back from what we are trying to accomplish here, and if it #hadToBeThisWay™, at the very least, some more explanation surrounding the why would be nice. Couldn't find anything meaningful in the git history or on github.

@@ -345,9 +345,32 @@ def start_runner
Process.detach(pid)
pid
end
alias_method :start_runner_via_fork, :start_runner
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last commit that changes this and the code around is confusing and I don't understand why it is needed or how it works. Seems like we should change the caller instead of jumping through hoops in the interface to make this work, but I admit that I still am unsure what you are fixing/trying to accomplish in that commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that I have figured out the "why" for this specific work around, but I still don't like what we are doing.

Instead of the embedded ansible worker being a separate process like everything else, it is a Thread off the main MiqServer::Monitor process. That seems like it is all sorts of wrong, and doesn't translate to a containerized environment at all.

I can't find any rational as to why we (well... you) made this worker behave so differently, just that it was done (can't even find a PR for it, just a commit: 18ffca9 ).

Regardless of why the worker is like this, the code for this runner seems to be doing complete against a forking/spawning model, meaning it doesn't do either.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. It's different. It will need to be changed with re-arch. The idea was that we didn't actually need to have a process to monitor another process so it was done in a thread. start_runner was supposed to be for subclasses to implement in any way those wanted, spawn, fork, thread, etc.

Let me review this again in the morning to see if we can do this cleaner. I agree, it's messy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'm not sure what happened to that commit, let me follow the bouncing ball...

@@ -52,7 +56,12 @@
worker_class = worker_class.constantize
worker_class.before_fork
unless options[:dry_run]
worker = worker_class.create_worker_record
worker = if options[:guid]
worker_class.find_by(:guid => options[:guid])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I wrote this, I can be critical of it ;)

Maybe we should do a find_by!, or something similar so that we error out if we can't find the worker record.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha, you can be critical of other things too... Otherwise, what's the point? I know you're kidding but wanted to make sure you were "fine" with throwing 🍅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and yes, I'll look at changing it

@chessbyte chessbyte requested a review from Fryguy June 15, 2017 15:53
@jrafanie jrafanie force-pushed the make_spawn_an_option_again branch 3 times, most recently from 224c9ca to 8ecf792 Compare June 15, 2017 20:03
@@ -18,6 +19,7 @@ def start_runner
end
end
end
alias_method :start_runner_via_spawn, :start_runner_via_fork
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carbonin please review this part. It's kinda icky that this worker needs to know that it has to override both of these methods with a threaded runner but it seems like the simplest solution to get this feature on master

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an excuse to not include @carbonin 's input, but my two cents for how this should be done is leave start_runner as the method that is overwritten by this class, and slightly change up how it is implemented in the main MiqWorker class:

  def start_runner
+   ENV['MIQ_SPAWN_WORKERS'].to_s.downcase == "true" ? start_runner_via_spawn : start_runner_via_fork
+ end
+
+ def start_runner_via_fork
    self.class.before_fork
    pid = fork(:cow_friendly => true) do
      self.class.after_fork
 @@ -346,8 +346,25 @@ def start_runner
    pid
  end
  
+ def self.build_command_line(guid)
+   command_line = "#{Gem.ruby} #{runner_script} --heartbeat --guid=#{guid} #{self.name}"
+   ENV['APPLIANCE'] ? "nice #{nice_increment} #{command_line}" : command_line
+ end
+
+ def self.runner_script
+   script = ManageIQ.root.join("lib/workers/bin/run_single_worker.rb")
+   raise "script not found: #{script}" unless File.exist?(script)
+   script
+ end
+
+ def start_runner_via_spawn
+   self.pid = Kernel.spawn(self.class.build_command_line(guid), :out => "/dev/null", :err => [Rails.root.join("log", "evm.log"), "a"])
+   Process.detach(pid)
+   pid
+ end
+ 
  def start
   self.pid = start_runner

This way, any changes that are necessary to the normal workflow are done in the edgecase classes, and attempted to be accounted for in the base class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea, I like that better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it too...minor tweak to it:

ENV['MIQ_SPAWN_WORKERS'] ? start_runner_via_spawn : start_runner_via_fork

You don't need to check for an explicit value in the ENV var, just its presence.

def start
self.pid = start_runner
self.pid = ENV['MIQ_SPAWN_WORKERS'].to_s.downcase == "true" ? start_runner_via_spawn : start_runner_via_fork
Copy link
Member

@Fryguy Fryguy Jun 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see you had implemented it already, but you can drop the .to_s.downcase == "true" bit...it's not necessary

@jrafanie jrafanie force-pushed the make_spawn_an_option_again branch from 8ecf792 to a3e0587 Compare June 16, 2017 17:25
jrafanie and others added 2 commits June 16, 2017 13:46
Use run_single_worker script for worker spawning

Embedded ansible worker overrides start_runner to implement
a thread based worker implementation.
When running single workers in the context of a worker monitor, the
worker records are usually already created by the monitor process by the
time the worker is forked/spawned.

Allowing the lib/worker/bin/run_single_worker.rb script to pass a guid
allows it to use those records instead of creating new ones.  If this
option isn't used, the already created ones exist along side the other
worker records, and cause evm:stop to not function properly.
@jrafanie jrafanie force-pushed the make_spawn_an_option_again branch from a3e0587 to 3f3a823 Compare June 16, 2017 17:46
@jrafanie
Copy link
Member Author

Ok, fixed the cops and addressed all the things. Thanks for the suggestions @NickLaMuro and @Fryguy

@miq-bot
Copy link
Member

miq-bot commented Jun 16, 2017

Checked commits jrafanie/manageiq@6d3362f~...3f3a823 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
5 files checked, 0 offenses detected
Everything looks fine. 🏆

@gtanzillo gtanzillo merged commit 8729c6e into ManageIQ:master Jun 16, 2017
@jrafanie jrafanie deleted the make_spawn_an_option_again branch June 16, 2017 18:22
@chessbyte chessbyte added this to the Sprint 63 Ending Jun 19, 2017 milestone Aug 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants